Skip to content

Conversation

@nicktindall
Copy link
Contributor

@nicktindall nicktindall commented Jul 31, 2025

Add a count of undesired shard allocations broken down by shard routing role.

This is so that we can check whether there are undesired shard allocations in a specific tier in serverless. We don't have an explicit concept of tier (search/indexing) in the code, but shard routing role is a good enough proxy for that.

Relates: ES-12221

@elasticsearchmachine elasticsearchmachine added v9.2.0 serverless-linked Added by automation, don't add manually labels Jul 31, 2025
@nicktindall nicktindall requested a review from pxsalehi July 31, 2025 07:31
@nicktindall nicktindall added >non-issue :Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) labels Jul 31, 2025
@nicktindall nicktindall removed the serverless-linked Added by automation, don't add manually label Aug 1, 2025
@nicktindall nicktindall changed the title Breakdown undesired allocations by tier Breakdown undesired allocations by shard routing role Aug 1, 2025
@nicktindall nicktindall marked this pull request as ready for review August 1, 2025 05:29
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

@elasticsearchmachine elasticsearchmachine added the Team:Distributed Coordination Meta label for Distributed Coordination team label Aug 1, 2025
long unassignedShards,
long totalAllocations,
long undesiredAllocationsExcludingShuttingDownNodes,
Map<ShardRouting.Role, Long> undesiredAllocationsExcludingShuttingDownNodesByRole
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're going to collect it all in a map of roles, then shouldn't it also replace totalAllocations and undesiredAllocationsExcludingShuttingDownNodes since those would be the values that the "default" role would have in stateful? but then you get into what to do for the values returned from desired balance API in serverless, and there you'd have to sum up index_only and search_only I guess, and sprinkle a bunch of asserts since index/search and default should be mutually exclusive in this map. Another option is to just add the specific break downs we need here? index/searchTierAllocations and index/searchTierUndesiredAllocations? I think you'd need both total and undesired since we're going to need the ratio in the autoscaler.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also how do we get these two stats (index/TierAllocations and indexTierUndesiredAllocations?) out of the balancer in the autoscaler? Is there a getter for these?

Copy link
Contributor Author

@nicktindall nicktindall Aug 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option is to just add the specific break downs we need here? index/searchTierAllocations and index/searchTierUndesiredAllocations?

As far as I can tell we have no concept of "tier" in the ES codebase. There is role, but we seem to stop short of defining that as equivalent to a tier. I don't quite understand why but I don't want to make assumptions about what role means in regards to tiers (see co.elastic.elasticsearch.stateless.allocation.StatelessAllocationDecider#canAllocateShardToNode, it's not even done there)

if we make two specific fields, we're baking in the assumption that

  • Role.INDEX_ONLY means indexing tier, Role.SEARCH_ONLY means search tier and Role.DEFAULT means the only tier in a stateful deployment
  • the existing set of roles are fixed

So we'd probably need to add assertions to trigger if these assumptions were no longer valid. I think having a map of role to counts (possibly Role -> record RoleStats(int total, int undesired)) at least leaves the interpretation of the roles to the context they're used in.

Maybe these are valid assumptions? Perhaps there is some history regarding the modelling?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also how do we get these two stats (index/TierAllocations and indexTierUndesiredAllocations?) out of the balancer in the autoscaler? Is there a getter for these?

I added a getter and put up a serverless PR to illustrate that flow

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell we have no concept of "tier" in the ES codebase. There is role, but we seem to stop short of defining that as equivalent to a tier

:))) it feels like we're saying the same thing. index/searchTierAllocations and index/searchTierUndesiredAllocations are not good names, I agree with you. I was suggesting instead of the map, we could just add two more variables for total allocation and undesired allocation of index_only (or could be called promotable shards), since we don't need the rest. that's the only part of that map, that we use for now for the specific ticket that initiated this change. we could also use the map and ignore the rest, if you prefer, that's fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I prefer the map. It's a breakdown of those stats by node role. In this code we never cared about node role previously, and we should continue to not care.

If we pick out specific roles to populate fields we're baking in knowledge of the roles and their meaning in this code, I'd rather we tried to keep that knowledge in the stateless code, which is the only place we should see anything other than DEFAULT, and the only place we're using the per-role stats.

@elasticsearchmachine elasticsearchmachine added the serverless-linked Added by automation, don't add manually label Aug 11, 2025
@nicktindall nicktindall requested a review from pxsalehi August 11, 2025 07:46
* @param unassignedShards Shards that are not assigned to any node.
* @param allocationStatsByRole A breakdown of the allocations stats by {@link ShardRouting.Role}
*/
public record AllocationStats(long unassignedShards, Map<ShardRouting.Role, RoleAllocationStats> allocationStatsByRole) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you're going with replacing totals with the role-based map, doesn't the current version implicitly assume default and search_only/index_only are exclusive? Shouldn't we assert that either the counts for default are empty or for both search_only/index_only, so that when this assumption is not valid totalAllocations() and undesiredAllocationsExcludingShuttingDownNodes() wouldn't silently return something wrong? That's what I was saying in my previous comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, as per #132235 (comment) this code was previously agnostic of role and should continue to be so.

There's parts of the code that care about these stats broken down by role (in stateless) and parts that just want the totals (here).

long unassignedShards,
long totalAllocations,
long undesiredAllocationsExcludingShuttingDownNodes,
Map<ShardRouting.Role, Long> undesiredAllocationsExcludingShuttingDownNodesByRole
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell we have no concept of "tier" in the ES codebase. There is role, but we seem to stop short of defining that as equivalent to a tier

:))) it feels like we're saying the same thing. index/searchTierAllocations and index/searchTierUndesiredAllocations are not good names, I agree with you. I was suggesting instead of the map, we could just add two more variables for total allocation and undesired allocation of index_only (or could be called promotable shards), since we don't need the rest. that's the only part of that map, that we use for now for the specific ticket that initiated this change. we could also use the map and ignore the rest, if you prefer, that's fine.

@nicktindall nicktindall requested a review from pxsalehi August 14, 2025 02:23
@nicktindall nicktindall merged commit c9b45d8 into elastic:main Aug 14, 2025
33 checks passed
szybia added a commit to szybia/elasticsearch that referenced this pull request Aug 15, 2025
* upstream/main: (32 commits)
  Speed up loading keyword fields with index sorts (elastic#132950)
  Mute org.elasticsearch.index.mapper.LongFieldMapperTests testSyntheticSourceWithTranslogSnapshot elastic#132964
  Simplify EsqlSession (elastic#132848)
  Implement WriteLoadConstraintDecider#canAllocate (elastic#132041)
  Mute org.elasticsearch.test.rest.yaml.CcsCommonYamlTestSuiteIT test {p0=search/400_synthetic_source/_doc_count} elastic#132965
  Switch to PR-based benchmark pipeline defined in ES repo (elastic#132941)
  Breakdown undesired allocations by shard routing role (elastic#132235)
  Implement v_magnitude function (elastic#132765)
  Introduce execution location marker for better handling of remote/local compatibility (elastic#132205)
  Mute org.elasticsearch.cluster.ClusterInfoServiceIT testMaxQueueLatenciesInClusterInfo elastic#132957
  Unmuting simulate index data stream mapping overrides yaml rest test (elastic#132946)
  Remove CrossClusterCancellationIT.createLocalIndex() (elastic#132952)
  Mute org.elasticsearch.index.mapper.LongFieldMapperTests testFetch elastic#132956
  Fix failing UT by adding a required capability (elastic#132947)
  Precompute the BitsetCacheKey hashCode (elastic#132875)
  Adding simulate ingest effective mapping (elastic#132833)
  Mute org.elasticsearch.index.mapper.LongFieldMapperTests testFetchMany elastic#132948
  Rename skipping logic to remove hard link to skip_unavailable (elastic#132861)
  Store ignored source in unique stored fields per entry (elastic#132142)
  Add random tests with match_only_text multi-field (elastic#132380)
  ...
joshua-adams-1 pushed a commit to joshua-adams-1/elasticsearch that referenced this pull request Aug 15, 2025
In order that we can prevent scale-down in stateless when there are undesired allocations specifically in the indexing tier

Closes: ES-12221

Co-authored-by: Pooya Salehi <[email protected]>
szybia added a commit to szybia/elasticsearch that referenced this pull request Aug 15, 2025
…-stats

* upstream/main: (36 commits)
  Fix reproducability of builds against Java EA versions (elastic#132847)
  Speed up loading keyword fields with index sorts (elastic#132950)
  Mute org.elasticsearch.index.mapper.LongFieldMapperTests testSyntheticSourceWithTranslogSnapshot elastic#132964
  Simplify EsqlSession (elastic#132848)
  Implement WriteLoadConstraintDecider#canAllocate (elastic#132041)
  Mute org.elasticsearch.test.rest.yaml.CcsCommonYamlTestSuiteIT test {p0=search/400_synthetic_source/_doc_count} elastic#132965
  Switch to PR-based benchmark pipeline defined in ES repo (elastic#132941)
  Breakdown undesired allocations by shard routing role (elastic#132235)
  Implement v_magnitude function (elastic#132765)
  Introduce execution location marker for better handling of remote/local compatibility (elastic#132205)
  Mute org.elasticsearch.cluster.ClusterInfoServiceIT testMaxQueueLatenciesInClusterInfo elastic#132957
  Unmuting simulate index data stream mapping overrides yaml rest test (elastic#132946)
  Remove CrossClusterCancellationIT.createLocalIndex() (elastic#132952)
  Mute org.elasticsearch.index.mapper.LongFieldMapperTests testFetch elastic#132956
  Fix failing UT by adding a required capability (elastic#132947)
  Precompute the BitsetCacheKey hashCode (elastic#132875)
  Adding simulate ingest effective mapping (elastic#132833)
  Mute org.elasticsearch.index.mapper.LongFieldMapperTests testFetchMany elastic#132948
  Rename skipping logic to remove hard link to skip_unavailable (elastic#132861)
  Store ignored source in unique stored fields per entry (elastic#132142)
  ...
@nicktindall nicktindall deleted the undesired_by_tier branch August 21, 2025 01:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) >non-issue serverless-linked Added by automation, don't add manually Team:Distributed Coordination Meta label for Distributed Coordination team v9.2.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants